target vector
Beyond Single Embeddings: Capturing Diverse Targets with Multi-Query Retrieval
Chen, Hung-Ting, Liu, Xiang, Ravfogel, Shauli, Choi, Eunsol
Most text retrievers generate a single query vector to retrieve relevant documents. Yet the conditional distribution of relevant documents for a query may be multi-modal, e.g., representing different interpretations of the query. We first quantify the limitations of existing retrievers: all retrievers we evaluate struggle more as the distance between target document embeddings grows. Our model, AMER, autoregressively generates multiple query vectors, and all predicted query vectors are used to retrieve documents from the corpus. On synthetic vectorized data, the proposed method captures multiple target distributions perfectly, showing 4x better performance than a single-embedding model. We also fine-tune our model on real-world multi-answer retrieval datasets and evaluate it in-domain. AMER achieves 4% and 21% relative gains over single-embedding baselines on the two datasets we evaluate on. Furthermore, we consistently observe larger gains on the subsets where the embeddings of the target documents are less similar to each other. We demonstrate the potential of multi-query-vector retrievers and open up a new direction for future work.

As large language models (LLMs) have limited, outdated parametric knowledge, augmenting knowledge at inference time by prepending retrieved documents has become the de facto solution (Fan et al., 2024; Gao et al., 2023). Retrieving a diverse set of documents is crucial for providing comprehensive information (Xu et al., 2023), as an answer providing partial information can be technically correct yet misleading to users. In this work, we study retrieving a diverse set of documents per query. We first analyze the behavior of existing retrievers (Izacard et al., 2022; Yang et al., 2025b; Zhang et al., 2025; Lee et al., 2025a) on datasets (Min et al., 2020; Amouyal et al., 2023) containing questions that admit multiple valid answers.
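As a rough illustration of the retrieval side of this idea, here is a minimal NumPy sketch (not the paper's code; the function name and the union-of-top-k merging rule are assumptions): each of the autoregressively generated query vectors retrieves its own top-k documents, and the per-vector result sets are merged.

```python
import numpy as np

def multi_query_retrieve(query_vectors, doc_embeddings, k=5):
    """Retrieve the union of top-k documents over several query vectors."""
    scores = query_vectors @ doc_embeddings.T        # (num_queries, num_docs)
    hits = set()
    for row in scores:
        hits.update(np.argsort(-row)[:k].tolist())   # top-k for this query vector
    return sorted(hits)

# Usage with random stand-in embeddings
docs = np.random.randn(1000, 128)
queries = np.random.randn(3, 128)   # e.g. 3 autoregressively generated query vectors
print(multi_query_retrieve(queries, docs, k=5))
```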
The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
Chen, Jiale, Shabanzadeh, Yalda, Crnčević, Elvir, Hoefler, Torsten, Alistarh, Dan
Quantizing the weights of large language models (LLMs) from 16-bit to lower bitwidths is the de facto approach for deploying massive transformers onto more affordable accelerators. While GPTQ has emerged as one of the standard methods for one-shot post-training quantization at LLM scale, its inner workings are described as a sequence of ad hoc algebraic updates that obscure its geometric meaning and worst-case guarantees. In this work, we show that, when executed back-to-front (from the last to the first dimension) for a linear layer, GPTQ is mathematically identical to Babai's nearest plane algorithm for the classical closest vector problem (CVP) on a lattice defined by the Hessian matrix of the layer's inputs. This equivalence rests on a sophisticated mathematical argument and has two analytical consequences: first, the GPTQ error-propagation step gains an intuitive geometric interpretation; second, GPTQ inherits the error upper bound of Babai's algorithm under the assumption that no weights are clipped. Leveraging this bound, we design post-training quantization methods that avoid clipping and outperform the original GPTQ. In addition, we provide efficient GPU inference kernels for the resulting representation. Taken together, these results place GPTQ on firm theoretical footing and open the door to importing decades of progress in lattice algorithms into the design of future quantization algorithms for billion-parameter models.
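For readers unfamiliar with the lattice side of this equivalence, below is a minimal NumPy sketch of Babai's nearest plane algorithm on a generic basis; the mapping from GPTQ's Hessian-defined lattice onto such a basis is not shown here, and the implementation is illustrative rather than the authors'.

```python
import numpy as np

def babai_nearest_plane(B, t):
    """Approximate the lattice vector closest to t (rows of B are basis vectors)."""
    B = np.asarray(B, dtype=float)
    t = np.asarray(t, dtype=float)
    n = B.shape[0]
    # Gram-Schmidt orthogonalization of the basis
    Bstar = B.copy()
    for i in range(n):
        for j in range(i):
            Bstar[i] -= (Bstar[i] @ Bstar[j]) / (Bstar[j] @ Bstar[j]) * Bstar[j]
    # Walk back-to-front, rounding to the nearest hyperplane at each step
    # (the same last-to-first order in which GPTQ is executed in the paper).
    b = t.copy()
    coeffs = np.zeros(n)
    for i in reversed(range(n)):
        coeffs[i] = round((b @ Bstar[i]) / (Bstar[i] @ Bstar[i]))
        b -= coeffs[i] * B[i]
    return coeffs @ B   # the lattice point found
```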
Defending Deep Regression Models against Backdoor Attacks
Du, Lingyu, Liu, Yupei, Jia, Jinyuan, Lan, Guohao
Deep regression models are used in a wide variety of safety-critical applications, but are vulnerable to backdoor attacks. Although many defenses have been proposed for classification models, they are ineffective for regression because they do not account for its uniqueness. First, the outputs of regression models are continuous values rather than discretized labels, so the potential infected target of a backdoored regression model has infinitely many possibilities, which makes it impossible for existing defenses to determine. Second, the backdoor behavior of a backdoored deep regression model is triggered by the activation values of all neurons in the feature space, which makes it difficult to detect and mitigate with existing defenses. To resolve these problems, we propose DRMGuard, the first defense that can identify whether a deep regression model in the image domain is backdoored. DRMGuard formulates an optimization problem for reverse engineering based on the unique output-space and feature-space characteristics of backdoored deep regression models. We conduct extensive evaluations on two regression tasks and four datasets. The results show that DRMGuard consistently defends against various backdoor attacks. We also generalize four state-of-the-art defenses designed for classifiers to regression models and compare DRMGuard with them; DRMGuard significantly outperforms all of them.

Regression techniques are widely used to solve tasks where the goal is to predict continuous values. Unsurprisingly, like their classification counterparts, regression techniques have been revolutionized by deep learning and have achieved state-of-the-art results in many real-world applications.
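To make the reverse-engineering idea concrete, here is a heavily simplified PyTorch sketch. It is not DRMGuard's actual formulation: the low-output-variance objective and all hyperparameters below are assumptions standing in for the paper's output-space and feature-space characteristics.

```python
import torch

def reverse_engineer_trigger(model, images, steps=200, lam=1e-3):
    """Optimize a mask + pattern so stamped inputs collapse to a near-constant
    output; low output variance hints at a many-to-one backdoor mapping."""
    mask = torch.zeros(1, 1, *images.shape[2:], requires_grad=True)
    pattern = torch.zeros(1, *images.shape[1:], requires_grad=True)
    opt = torch.optim.Adam([mask, pattern], lr=0.05)
    for _ in range(steps):
        m = torch.sigmoid(mask)                       # soft mask in [0, 1]
        stamped = (1 - m) * images + m * torch.sigmoid(pattern)
        out = model(stamped)
        # variance term: continuous outputs collapsing to one value is suspicious;
        # L1 term keeps the recovered trigger small (assumed regularizer)
        loss = out.var() + lam * m.abs().sum()
        opt.zero_grad(); loss.backward(); opt.step()
    return torch.sigmoid(mask).detach(), torch.sigmoid(pattern).detach()
```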
Streaming Compression of Scientific Data via weak-SINDy
Russo, Benjamin P., Laiu, M. Paul, Archibald, Richard
In this paper, a streaming weak-SINDy algorithm is developed specifically for compressing streaming scientific data. The production of scientific data, via either simulation or experiment, is undergoing a stage of exponential growth, which makes data compression important and often necessary for storing and utilizing large scientific data sets. As opposed to classical "offline" compression algorithms that operate on a readily available data set, streaming compression algorithms compress data "online" while the data generated from simulations or experiments is still flowing through the system. This feature makes streaming compression algorithms well-suited for scientific data compression, where storing the full data set offline is often infeasible. This work proposes a new streaming compression algorithm, streaming weak-SINDy, which takes advantage of the underlying data characteristics during compression. The streaming weak-SINDy algorithm constructs feature matrices and target vectors in the online stage via a streaming integration method in a memory-efficient manner. The feature matrices and target vectors are then used in the offline stage to build a model through a regression process that aims to recover equations governing the evolution of the data. For compressing high-dimensional streaming data, we adopt a streaming proper orthogonal decomposition (POD) process to reduce the data dimension and then use the streaming weak-SINDy algorithm to compress the temporal data of the POD expansion. We propose modifications to the streaming weak-SINDy algorithm to accommodate the dynamically updated POD basis. By combining the model built by the streaming weak-SINDy algorithm with a small number of data samples, the full data flow can be reconstructed accurately at low memory cost, as shown in the numerical tests.
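The online/offline split can be illustrated with a short sketch. The authors accumulate weak-form (test-function) integrals; the plain least-squares accumulators and the polynomial feature library below are simplifying assumptions meant only to show the streaming structure.

```python
import numpy as np

def theta(x):
    # hypothetical polynomial feature library up to degree 2
    return np.concatenate([[1.0], x, np.outer(x, x)[np.triu_indices(len(x))]])

class StreamingSINDySketch:
    """Online stage: accumulate small regression blocks as data streams by.
    Offline stage: solve once for the governing-equation coefficients."""
    def __init__(self):
        self.A = None   # running feature Gram matrix (features x features)
        self.b = None   # running feature-target products (features x state dim)

    def update(self, x, xdot):
        f = theta(np.asarray(x, dtype=float))
        if self.A is None:
            self.A = np.zeros((len(f), len(f)))
            self.b = np.zeros((len(f), len(xdot)))
        self.A += np.outer(f, f)
        self.b += np.outer(f, xdot)

    def solve(self, ridge=1e-8):
        # offline stage: tiny memory footprint regardless of stream length
        return np.linalg.solve(self.A + ridge * np.eye(len(self.A)), self.b)
```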
Perception Helps Planning: Facilitating Multi-Stage Lane-Level Integration via Double-Edge Structures
You, Guoliang, Chu, Xiaomeng, Duan, Yifan, Zhang, Wenyu, Li, Xingchen, Zhang, Sha, Li, Yao, Ji, Jianmin, Zhang, Yanyong
When planning for autonomous driving, it is crucial to consider essential traffic elements such as lanes, intersections, traffic regulations, and dynamic agents. However, these are often overlooked by traditional end-to-end planning methods, likely leading to inefficiencies and non-compliance with traffic regulations. In this work, we integrate the perception of these elements into the planning task. To this end, we propose Perception Helps Planning (PHP), a novel framework that reconciles lane-level planning with perception. This integration ensures that planning is inherently aligned with traffic constraints, facilitating safe and efficient driving. Specifically, PHP focuses on both edges of a lane for planning and perception purposes, taking into account the 3D positions of both lane edges as well as attributes for lane intersections, lane directions, lane occupancy, and planning. Algorithmically, the process begins with a transformer encoding multi-camera images to extract the above features and predict lane-level perception results. Next, a hierarchical feature early-fusion module refines the features for predicting planning attributes. Finally, the double-edge interpreter uses a late-fusion process specifically designed to integrate lane-level perception and planning information, culminating in the generation of vehicle control signals. Experiments on three CARLA benchmarks show significant improvements in driving score of 27.20%, 33.47%, and 15.54% over existing algorithms, respectively, achieving state-of-the-art performance with the system operating at up to 22.57 FPS.
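A skeletal reading of the three-stage pipeline is sketched below; every module name, size, and head is a hypothetical placeholder rather than the authors' architecture, intended only to show how lane-level perception features feed the planning output.

```python
import torch
import torch.nn as nn

class PHPSkeleton(nn.Module):
    """Toy stand-in for the described encode -> fuse -> interpret pipeline."""
    def __init__(self, d=256, n_ctrl=3):
        super().__init__()
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=d, nhead=8, batch_first=True), 2)
        self.perception_head = nn.Linear(d, 8)    # lane-edge positions/attributes (stub)
        self.fusion = nn.Linear(d + 8, d)         # hierarchical early fusion (stub)
        self.interpreter = nn.Linear(d, n_ctrl)   # late fusion -> control signals

    def forward(self, img_tokens):                # (batch, tokens, d) camera features
        h = self.encoder(img_tokens)
        lanes = self.perception_head(h)           # lane-level perception results
        fused = torch.relu(self.fusion(torch.cat([h, lanes], dim=-1)))
        return self.interpreter(fused.mean(dim=1)), lanes
```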
Fast Redescription Mining Using Locality-Sensitive Hashing
Karjalainen, Maiju, Galbrun, Esther, Miettinen, Pauli
A redescription is a pattern that characterises roughly the same entities in two different ways, and redescription mining is the task of automatically extracting redescriptions from the input dataset, given user-defined constraints. Redescription mining has found applications in various fields of science, such as ecometrics. Ecometrics aims to identify and model the functional relationships between traits of organisms and their environments [5, 7]. For instance, the teeth of large plant-eating mammals are adapted to the food that is available in their environment, which in turn depends on the climatic conditions, potentially allowing one to reason about the climate in the past based on the fossil record. To apply redescription mining in this context, the entities in the dataset represent localities, with two sets of attributes recording respectively the distribution of dental traits among species and the climatic conditions at each locality [11, 19].
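As a toy illustration of what is being mined, the snippet below builds a synthetic locality table and scores one candidate redescription, one query per attribute side, by the Jaccard index of their support sets; all data and thresholds are made up, and the paper's LSH-based speedups are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500                                   # localities (the entities)
dental_trait = rng.random(n)              # e.g. mean tooth-crown height at a locality
temperature = 30 * dental_trait + rng.normal(0, 3, n)   # correlated climate variable

# One query per attribute side; a good redescription selects nearly the same localities
support_left = dental_trait > 0.7         # query over the dental-traits side
support_right = temperature > 21.0        # query over the climate side
jaccard = (support_left & support_right).sum() / (support_left | support_right).sum()
print(f"Jaccard accuracy of the candidate redescription: {jaccard:.2f}")
```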
9232fe81225bcaef853ae32870a2b0fe-Reviews.html
The key idea of this paper is to use the latent vectors generated from upper layers of a Deep Belief Network as target vectors, by adding a cross-entropy regularization term to the standard unsupervised training loss, so as to encourage reconstructions from the bottom and the top to match. Instead of the standard two-stage training of deep architectures (unsupervised layer-by-layer pretraining, then full-network supervised training), training is conducted in three stages, with an intermediate stage that uses this hybrid loss. Experimental results show that the intermediate stage substantially improves results on MNIST and Caltech101. This regularization of intermediate layers by top-down generative weights shows good results; the paper is clear and shows how the proposed intuitive way of bridging unsupervised and supervised training indeed improves performance. The paper is overall rather clear, although there are some problems (see below).
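A hedged sketch of the hybrid objective as the review describes it (function and argument names are assumptions, not the paper's code): the usual reconstruction loss plus a cross-entropy term pulling the bottom-up reconstruction toward targets generated top-down from the upper layers.

```python
import torch.nn.functional as F

def hybrid_loss(x, x_recon, h_topdown, h_bottomup, weight=0.5):
    """All tensors assumed to hold probabilities in (0, 1)."""
    recon = F.binary_cross_entropy(x_recon, x)              # standard unsupervised term
    match = F.binary_cross_entropy(h_bottomup, h_topdown)   # top-down latents as soft targets
    return recon + weight * match
```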
Distributed Estimation with Partially Accessible Information: An IMAT Approach to LMS Diffusion
Shamsi, Mahdi, Marvasti, Farokh
Distributed algorithms, particularly Diffusion Least Mean Square (LMS), are widely favored for their reliability, robustness, and fast convergence in various industries. However, limited observability of the target can compromise the integrity of the algorithm. To address this issue, this paper proposes a framework for analyzing combination strategies, drawing inspiration from signal-flow analysis. A thresholding-based algorithm is also presented to identify and exploit the support of the target vector in scenarios where this support information is missing. The proposed approach is demonstrated in two combination scenarios, showcasing the effectiveness of the algorithm in situations characterized by sparse observations in the time and transform domains.
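Below is a minimal sketch of the diffusion LMS adapt-then-combine recursion with a hard-thresholding step to exploit sparsity of the target vector's support; this is an IMAT-flavored illustration with made-up parameters, not the paper's exact algorithm.

```python
import numpy as np

def diffusion_lms_threshold(U, d, C, mu=0.01, tau=0.05, iters=200):
    """U[k]: (T, M) regressors at node k; d[k]: (T,) observations at node k;
    C: (N, N) row-stochastic combination matrix over the N-node network."""
    N, (T, M) = len(U), U[0].shape
    W = np.zeros((N, M))                   # per-node estimates of the target vector
    for t in range(iters):
        psi = np.empty_like(W)
        for k in range(N):
            u, dk = U[k][t % T], d[k][t % T]
            psi[k] = W[k] + mu * u * (dk - u @ W[k])   # adapt: local LMS step
        W = C @ psi                                    # combine: average over neighbors
        W[np.abs(W) < tau] = 0.0                       # threshold: enforce sparse support
    return W
```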
Improving novelty detection with generative adversarial networks on hand gesture data
Simão, Miguel, Neto, Pedro, Gibaru, Olivier
We propose a novel way of solving the problem of classifying out-of-vocabulary gestures using Artificial Neural Networks (ANNs) trained in the Generative Adversarial Network (GAN) framework. A generative model augments the data set in an online fashion with new samples and stochastic target vectors, while a discriminative model determines the class of the samples. The approach was evaluated on the UC2017 SG and UC2018 DualMyo data sets. The generative models' performance was measured with a distance metric between generated and real samples. The discriminative models were evaluated by their accuracy on trained and novel classes. In terms of sample generation quality, the GAN's samples are significantly closer to the real samples, in mean distance, than a random distribution (noise), for all classes. In the classification tests, the baseline neural network was not capable of identifying untrained gestures. With the proposed methodology, we found a trade-off between the detection of trained and untrained gestures, with some trained samples being mistaken for novelty. Nevertheless, a novelty detection accuracy of 95.4% or 90.2% (depending on the data set) was achieved with just a 5% loss of accuracy on trained classes.
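To make the stochastic-target mechanism concrete, here is a small sketch (function names and the confidence threshold are assumptions): generated samples are paired with random points on the probability simplex instead of one-hot labels, so the classifier learns low-confidence outputs off the training manifold, and a gesture is flagged as novel at test time when no class is confident.

```python
import numpy as np

def stochastic_targets(batch_size, n_classes, rng):
    """Random soft labels for GAN-generated samples (points on the simplex)."""
    t = rng.random((batch_size, n_classes))
    return t / t.sum(axis=1, keepdims=True)

def is_novel(probs, threshold=0.5):
    """Flag a gesture as out-of-vocabulary when no class is confident."""
    return probs.max(axis=1) < threshold

# Usage with stand-in predictions
rng = np.random.default_rng(0)
print(stochastic_targets(4, 10, rng))
print(is_novel(np.array([[0.9, 0.05, 0.05], [0.4, 0.3, 0.3]])))
```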